2,838 research outputs found

    L'amas verbal au cœur d'une modélisation topologique de l'ordre des mots

    Short version presented at the Journées de la syntaxe "ordre des mots dans la phrase française", held in Bordeaux (November 2004).

    The relation between dependency distance and frequency

    This pilot study investigates the relationship between dependency distance and frequency, based on the analysis of an English dependency treebank. The preliminary result shows a non-linear relation between dependency distance and frequency. This relation can be further formalized as a power-law function, which can be used to predict the distribution of dependency distance in a treebank.
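
    As a rough illustration of the kind of relation described above, the sketch below counts dependency distances and fits a power law f(d) = a * d^b by least squares on log-log values. It is not the study's method: the head-index encoding, the function names, and the toy sentences are all assumptions made up for this example, and a log-log regression is only a simple estimator.

```python
import math
from collections import Counter

def dependency_distances(heads):
    """Return |head - dependent| for each token of one sentence.

    Assumed encoding: heads[i] is the 1-based position of the head of
    token i + 1, and 0 marks the root (which has no distance).
    """
    return [abs(head - (i + 1)) for i, head in enumerate(heads) if head != 0]

def fit_power_law(freq):
    """Fit f(d) = a * d**b by ordinary least squares on (log d, log f).

    freq maps each distance d (> 0) to its frequency f (> 0).
    Returns the pair (a, b).
    """
    xs = [math.log(d) for d in freq]
    ys = [math.log(f) for f in freq.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b

# Hypothetical toy sentences (invented for illustration, not treebank data).
toy_sentences = [
    [2, 0, 2],        # "She reads books"
    [2, 3, 0, 5, 3],  # "The cat chased a mouse"
]

# Frequency of each dependency distance across the toy sentences.
distance_freq = Counter(
    d for heads in toy_sentences for d in dependency_distances(heads)
)
a, b = fit_power_law(distance_freq)
```

    On real data, a negative exponent b would correspond to short dependencies being far more frequent than long ones; two toy sentences are of course far too little to support any such claim.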

    Non-constituent coordination and other coordinative constructions as Dependency Graphs

    This paper proposes a new dependency-based analysis of coordination that generalizes over existing analyses by combining symmetrical and asymmetrical analyses of coordination into a DAG structure. The new joint structure is shown to be theoretically grounded in the notion of connections between words, just like the formal definition of other types of dependencies. Besides formalizations of shared dependents (including right-node raising), paradigmatic adverbs, and embedded coordinations, a completely new formalization of non-constituent coordination is proposed.

    Speaking in piles: Paradigmatic annotation of French spoken corpus

    This article describes a central part of the syntactic schemes currently used in the ongoing annotation of a French spoken corpus. Based on the Aix School grid analysis of spoken French, the notion of "pile" is introduced, allowing for an elegant description of various paradigmatic phenomena such as disfluency, reformulation, apposition, instantiation (including question-answering and the colon effect), and different types of coordination. Piles naturally complete dependency annotations by modeling non-functional relations between phrases.

    Rediscovering Greenberg's Word Order Universals in UD

    This paper discusses an empirical refoundation of selected Greenbergian word order universals based on a data analysis of the Universal Dependencies project. The nature of the data we work on allows us to extract rich details for testing well-known typological universals and therefore constitutes a valuable basis for validating Greenberg's universals. Our results show that we can refine some Greenbergian universals in a more empirical and accurate way by means of a data-driven typological analysis.

    A Surface-Syntactic UD Treebank for Naija

    This paper presents a syntactic treebank for spoken Naija, an English pidgincreole that is rapidly spreading across Nigeria. The syntactic annotation is developed in the Surface-Syntactic Universal Dependency annotation scheme (SUD) (Gerdes et al., 2018) and automatically converted into UD. We present the workflow of the treebank development for this under-resourced language. A crucial step in the syntactic analysis of a spoken language consists in manually adding a markup onto the transcription, indicating the segmentation into major syntactic units and their internal structure. We show that this so-called "macrosyntactic" markup improves parsing results. We also study some iconic syntactic phenomena that clearly distinguish Naija from English.

    Technological taxonomies for hypernym and hyponym retrieval in patent texts

    This paper presents an automatic approach to creating taxonomies of technical terms based on the Cooperative Patent Classification (CPC). The resulting taxonomy contains about 170k nodes in 9 separate technological branches and is freely available. We also show that a Text-to-Text Transfer Transformer (T5) model can be fine-tuned to generate hypernyms and hyponyms with relatively high precision, confirming the manually assessed quality of the resource. The T5 model opens the taxonomy to any new technological terms for which a hypernym can be generated, thus making the resource updateable with new terms, an essential feature for the constantly evolving field of technological terminology.
    Comment: ToTh 2022 - Terminology & Ontology: Theories and applications, Jun 2022, Chambéry, France

    Depends on what the French say: Spoken corpus annotation with and beyond syntactic function

    We present a syntactic annotation scheme for spoken French that is currently used in the Rhapsodie project. This annotation is dependency-based and includes coordination and disfluency as analogously encoded types of paradigmatic phenomena. Furthermore, we attempt a thorough definition of the discourse units required by the systematic annotation of other phenomena beyond usual sentence boundaries, which are typical for spoken language. This includes so-called "macrosyntactic" phenomena such as dislocation, parataxis, insertions, grafts, and epexegesis.

    Trois schémas d’annotation syntaxique en dépendance pour un même corpus de français oral : le cas de la macrosyntaxe

    We present three dependency annotation schemes applied to the same corpus of spoken French: Rhapsodie, Orféo and UD (Universal Dependencies). The first two are distributed; the third is work in progress. We emphasize the annotation of "macrosyntactic" phenomena, i.e., relations in the utterance that are not part of the sub-categorization. We contrast the three schemes and propose a fourth scheme that subsumes the three others.

    Volume 1: Modélisation, unités, structures

    Conceived as a general introduction to syntax, this book presents the basic notions needed to study how lexical and grammatical units combine within an utterance. Without adopting a preconceived framework, it examines the different options for representing syntactic structures, depending on the general principles and particular criteria adopted. Written with the aim of providing a basis for teaching syntax at university level, the book seeks to show that the properties of languages can be identified methodically and that order can be brought to the virgin forest that each language constitutes. It is divided into three parts: how to build a model of a language; how to determine the basic units of the language according to their meaning, form, and combinatorial properties; and how to define and represent the different modes of organization of these units. This last part presents an abundance of syntactic diagrams of various kinds. The book is divided into short sections, alternating the main content with sidelights, historical notes, more formal elaborations, linguistic examples from various languages, suggestions for further reading, and exercises with partial solutions.